[Feature] Discrete SAC compatibility with compile #2569

Merged (42 commits) on Dec 14, 2024

@vmoens vmoens commented Nov 15, 2024

[ghstack-poisoned]

pytorch-bot bot commented Nov 15, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2569

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There is 1 currently active SEV. If your PR is affected, please view it below:

❌ 2 New Failures, 17 Unrelated Failures

As of commit 105440a with merge base 7d7cd95:

NEW FAILURES - The following jobs have failed:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

BROKEN TRUNK - The following jobs failed but were also failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed label Nov 15, 2024
vmoens added a commit that referenced this pull request Nov 15, 2024
ghstack-source-id: 5b3d6a2100ad9cb96b9dd00d798ff628add59ca7
Pull Request resolved: #2569
[ghstack-poisoned]

github-actions bot commented Dec 13, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}4$. Worsened: $\large\color{#d91a1a}7$.

Columns: Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_simple 0.4238s 0.4232s 2.3628 Ops/s 2.2844 Ops/s $\color{#35bf28}+3.43\%$
test_transformed 0.6115s 0.6100s 1.6392 Ops/s 1.6110 Ops/s $\color{#35bf28}+1.75\%$
test_serial 1.3516s 1.3497s 0.7409 Ops/s 0.7474 Ops/s $\color{#d91a1a}-0.87\%$
test_parallel 1.3162s 1.2854s 0.7780 Ops/s 0.7616 Ops/s $\color{#35bf28}+2.15\%$
test_step_mdp_speed[True-True-True-True-True] 0.2124ms 29.7396μs 33.6252 KOps/s 33.3412 KOps/s $\color{#35bf28}+0.85\%$
test_step_mdp_speed[True-True-True-True-False] 61.4540μs 17.6074μs 56.7944 KOps/s 56.7015 KOps/s $\color{#35bf28}+0.16\%$
test_step_mdp_speed[True-True-True-False-True] 52.7580μs 16.9533μs 58.9857 KOps/s 60.0690 KOps/s $\color{#d91a1a}-1.80\%$
test_step_mdp_speed[True-True-True-False-False] 49.4420μs 10.0303μs 99.6975 KOps/s 99.3514 KOps/s $\color{#35bf28}+0.35\%$
test_step_mdp_speed[True-True-False-True-True] 70.8620μs 31.8754μs 31.3721 KOps/s 31.8187 KOps/s $\color{#d91a1a}-1.40\%$
test_step_mdp_speed[True-True-False-True-False] 75.7710μs 19.5428μs 51.1696 KOps/s 51.0579 KOps/s $\color{#35bf28}+0.22\%$
test_step_mdp_speed[True-True-False-False-True] 69.8600μs 18.6567μs 53.6001 KOps/s 53.9567 KOps/s $\color{#d91a1a}-0.66\%$
test_step_mdp_speed[True-True-False-False-False] 44.1320μs 12.0613μs 82.9095 KOps/s 85.1731 KOps/s $\color{#d91a1a}-2.66\%$
test_step_mdp_speed[True-False-True-True-True] 82.4530μs 33.7159μs 29.6596 KOps/s 30.1530 KOps/s $\color{#d91a1a}-1.64\%$
test_step_mdp_speed[True-False-True-True-False] 55.5930μs 21.6556μs 46.1775 KOps/s 46.8210 KOps/s $\color{#d91a1a}-1.37\%$
test_step_mdp_speed[True-False-True-False-True] 71.6630μs 18.5698μs 53.8508 KOps/s 53.7212 KOps/s $\color{#35bf28}+0.24\%$
test_step_mdp_speed[True-False-True-False-False] 34.2240μs 11.8807μs 84.1700 KOps/s 85.0041 KOps/s $\color{#d91a1a}-0.98\%$
test_step_mdp_speed[True-False-False-True-True] 91.0710μs 34.9414μs 28.6193 KOps/s 28.6599 KOps/s $\color{#d91a1a}-0.14\%$
test_step_mdp_speed[True-False-False-True-False] 58.5980μs 23.0238μs 43.4334 KOps/s 43.9604 KOps/s $\color{#d91a1a}-1.20\%$
test_step_mdp_speed[True-False-False-False-True] 43.7720μs 20.5411μs 48.6829 KOps/s 49.3754 KOps/s $\color{#d91a1a}-1.40\%$
test_step_mdp_speed[True-False-False-False-False] 59.3100μs 13.6690μs 73.1585 KOps/s 74.7832 KOps/s $\color{#d91a1a}-2.17\%$
test_step_mdp_speed[False-True-True-True-True] 77.4450μs 33.6855μs 29.6864 KOps/s 29.9856 KOps/s $\color{#d91a1a}-1.00\%$
test_step_mdp_speed[False-True-True-True-False] 73.9080μs 21.4180μs 46.6898 KOps/s 47.4978 KOps/s $\color{#d91a1a}-1.70\%$
test_step_mdp_speed[False-True-True-False-True] 66.9150μs 21.2584μs 47.0403 KOps/s 47.5233 KOps/s $\color{#d91a1a}-1.02\%$
test_step_mdp_speed[False-True-True-False-False] 48.5210μs 13.0866μs 76.4139 KOps/s 76.4253 KOps/s $\color{#d91a1a}-0.01\%$
test_step_mdp_speed[False-True-False-True-True] 92.3020μs 35.8970μs 27.8575 KOps/s 28.7882 KOps/s $\color{#d91a1a}-3.23\%$
test_step_mdp_speed[False-True-False-True-False] 52.3070μs 23.2949μs 42.9279 KOps/s 43.6567 KOps/s $\color{#d91a1a}-1.67\%$
test_step_mdp_speed[False-True-False-False-True] 2.7412ms 22.9242μs 43.6220 KOps/s 45.3233 KOps/s $\color{#d91a1a}-3.75\%$
test_step_mdp_speed[False-True-False-False-False] 43.0400μs 14.7757μs 67.6789 KOps/s 68.3056 KOps/s $\color{#d91a1a}-0.92\%$
test_step_mdp_speed[False-False-True-True-True] 80.3900μs 37.0099μs 27.0198 KOps/s 27.2833 KOps/s $\color{#d91a1a}-0.97\%$
test_step_mdp_speed[False-False-True-True-False] 72.8960μs 24.8217μs 40.2874 KOps/s 40.8161 KOps/s $\color{#d91a1a}-1.30\%$
test_step_mdp_speed[False-False-True-False-True] 53.8800μs 23.1874μs 43.1270 KOps/s 45.0217 KOps/s $\color{#d91a1a}-4.21\%$
test_step_mdp_speed[False-False-True-False-False] 52.4080μs 14.9030μs 67.1004 KOps/s 68.5963 KOps/s $\color{#d91a1a}-2.18\%$
test_step_mdp_speed[False-False-False-True-True] 76.1720μs 38.5737μs 25.9244 KOps/s 25.9407 KOps/s $\color{#d91a1a}-0.06\%$
test_step_mdp_speed[False-False-False-True-False] 83.5960μs 26.3176μs 37.9973 KOps/s 37.9006 KOps/s $\color{#35bf28}+0.26\%$
test_step_mdp_speed[False-False-False-False-True] 73.6070μs 24.5541μs 40.7265 KOps/s 41.1192 KOps/s $\color{#d91a1a}-0.96\%$
test_step_mdp_speed[False-False-False-False-False] 44.9140μs 16.3837μs 61.0362 KOps/s 60.9249 KOps/s $\color{#35bf28}+0.18\%$
test_values[generalized_advantage_estimate-True-True] 9.7065ms 9.3742ms 106.6754 Ops/s 105.2747 Ops/s $\color{#35bf28}+1.33\%$
test_values[vec_generalized_advantage_estimate-True-True] 40.2017ms 35.8634ms 27.8836 Ops/s 29.9773 Ops/s $\textbf{\color{#d91a1a}-6.98\%}$
test_values[td0_return_estimate-False-False] 0.2660ms 0.1777ms 5.6262 KOps/s 5.3997 KOps/s $\color{#35bf28}+4.20\%$
test_values[td1_return_estimate-False-False] 26.9550ms 23.9048ms 41.8326 Ops/s 41.8351 Ops/s $-0.01\%$
test_values[vec_td1_return_estimate-False-False] 37.9515ms 35.8301ms 27.9095 Ops/s 29.5759 Ops/s $\textbf{\color{#d91a1a}-5.63\%}$
test_values[td_lambda_return_estimate-True-False] 35.1915ms 34.7499ms 28.7770 Ops/s 28.8421 Ops/s $\color{#d91a1a}-0.23\%$
test_values[vec_td_lambda_return_estimate-True-False] 39.1763ms 35.7444ms 27.9764 Ops/s 29.7623 Ops/s $\textbf{\color{#d91a1a}-6.00\%}$
test_gae_speed[generalized_advantage_estimate-False-1-512] 12.3405ms 8.2943ms 120.5642 Ops/s 121.0901 Ops/s $\color{#d91a1a}-0.43\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.1919ms 1.9022ms 525.6978 Ops/s 503.9279 Ops/s $\color{#35bf28}+4.32\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.4958ms 0.3552ms 2.8154 KOps/s 2.8306 KOps/s $\color{#d91a1a}-0.54\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 49.0011ms 47.5800ms 21.0172 Ops/s 24.6547 Ops/s $\textbf{\color{#d91a1a}-14.75\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 3.7511ms 3.0343ms 329.5652 Ops/s 329.0924 Ops/s $\color{#35bf28}+0.14\%$
test_dqn_speed[False-None] 5.7697ms 1.4026ms 712.9370 Ops/s 727.6098 Ops/s $\color{#d91a1a}-2.02\%$
test_dqn_speed[False-backward] 1.9555ms 1.8699ms 534.7883 Ops/s 538.0636 Ops/s $\color{#d91a1a}-0.61\%$
test_dqn_speed[True-None] 0.6085ms 0.4627ms 2.1610 KOps/s 2.1469 KOps/s $\color{#35bf28}+0.66\%$
test_dqn_speed[True-backward] 1.3070ms 0.9264ms 1.0794 KOps/s 1.1298 KOps/s $\color{#d91a1a}-4.46\%$
test_dqn_speed[reduce-overhead-None] 0.6293ms 0.4645ms 2.1529 KOps/s 2.1426 KOps/s $\color{#35bf28}+0.48\%$
test_dqn_speed[reduce-overhead-backward] 0.9169ms 0.8810ms 1.1351 KOps/s 1.1129 KOps/s $\color{#35bf28}+1.99\%$
test_ddpg_speed[False-None] 3.9760ms 2.8799ms 347.2376 Ops/s 345.8526 Ops/s $\color{#35bf28}+0.40\%$
test_ddpg_speed[False-backward] 4.2402ms 4.0088ms 249.4528 Ops/s 248.1622 Ops/s $\color{#35bf28}+0.52\%$
test_ddpg_speed[True-None] 1.2768ms 0.9925ms 1.0076 KOps/s 986.0654 Ops/s $\color{#35bf28}+2.18\%$
test_ddpg_speed[True-backward] 2.0174ms 1.8735ms 533.7551 Ops/s 525.2841 Ops/s $\color{#35bf28}+1.61\%$
test_ddpg_speed[reduce-overhead-None] 1.1969ms 0.9935ms 1.0066 KOps/s 993.1618 Ops/s $\color{#35bf28}+1.35\%$
test_ddpg_speed[reduce-overhead-backward] 2.0252ms 1.9047ms 525.0270 Ops/s 509.4393 Ops/s $\color{#35bf28}+3.06\%$
test_sac_speed[False-None] 9.3672ms 8.1039ms 123.3973 Ops/s 125.1969 Ops/s $\color{#d91a1a}-1.44\%$
test_sac_speed[False-backward] 11.1721ms 10.8762ms 91.9439 Ops/s 93.5054 Ops/s $\color{#d91a1a}-1.67\%$
test_sac_speed[True-None] 2.0598ms 1.8268ms 547.4007 Ops/s 543.4317 Ops/s $\color{#35bf28}+0.73\%$
test_sac_speed[True-backward] 3.7600ms 3.5498ms 281.7021 Ops/s 281.3344 Ops/s $\color{#35bf28}+0.13\%$
test_sac_speed[reduce-overhead-None] 2.0435ms 1.8389ms 543.8059 Ops/s 545.6202 Ops/s $\color{#d91a1a}-0.33\%$
test_sac_speed[reduce-overhead-backward] 3.6072ms 3.5267ms 283.5495 Ops/s 284.1973 Ops/s $\color{#d91a1a}-0.23\%$
test_redq_speed[False-None] 18.5443ms 13.0420ms 76.6756 Ops/s 77.8742 Ops/s $\color{#d91a1a}-1.54\%$
test_redq_speed[False-backward] 24.8863ms 22.2448ms 44.9544 Ops/s 45.1704 Ops/s $\color{#d91a1a}-0.48\%$
test_redq_speed[True-None] 6.2041ms 4.5821ms 218.2400 Ops/s 217.6587 Ops/s $\color{#35bf28}+0.27\%$
test_redq_speed[True-backward] 13.5642ms 12.0468ms 83.0095 Ops/s 83.4006 Ops/s $\color{#d91a1a}-0.47\%$
test_redq_speed[reduce-overhead-None] 5.6516ms 4.5757ms 218.5475 Ops/s 220.5235 Ops/s $\color{#d91a1a}-0.90\%$
test_redq_speed[reduce-overhead-backward] 12.6965ms 12.0822ms 82.7665 Ops/s 83.2154 Ops/s $\color{#d91a1a}-0.54\%$
test_redq_deprec_speed[False-None] 15.7868ms 12.7635ms 78.3483 Ops/s 78.6585 Ops/s $\color{#d91a1a}-0.39\%$
test_redq_deprec_speed[False-backward] 20.8050ms 18.4686ms 54.1460 Ops/s 54.0030 Ops/s $\color{#35bf28}+0.26\%$
test_redq_deprec_speed[True-None] 4.2605ms 3.5683ms 280.2485 Ops/s 278.2953 Ops/s $\color{#35bf28}+0.70\%$
test_redq_deprec_speed[True-backward] 8.6590ms 8.0253ms 124.6055 Ops/s 125.4141 Ops/s $\color{#d91a1a}-0.64\%$
test_redq_deprec_speed[reduce-overhead-None] 3.7282ms 3.5708ms 280.0500 Ops/s 279.8350 Ops/s $\color{#35bf28}+0.08\%$
test_redq_deprec_speed[reduce-overhead-backward] 9.3586ms 8.0208ms 124.6753 Ops/s 124.5973 Ops/s $\color{#35bf28}+0.06\%$
test_td3_speed[False-None] 8.3298ms 7.9788ms 125.3326 Ops/s 125.1813 Ops/s $\color{#35bf28}+0.12\%$
test_td3_speed[False-backward] 10.7984ms 10.4198ms 95.9713 Ops/s 95.7880 Ops/s $\color{#35bf28}+0.19\%$
test_td3_speed[True-None] 1.9620ms 1.7239ms 580.0827 Ops/s 576.8256 Ops/s $\color{#35bf28}+0.56\%$
test_td3_speed[True-backward] 3.5417ms 3.3280ms 300.4785 Ops/s 296.8936 Ops/s $\color{#35bf28}+1.21\%$
test_td3_speed[reduce-overhead-None] 1.9326ms 1.7168ms 582.4824 Ops/s 574.3362 Ops/s $\color{#35bf28}+1.42\%$
test_td3_speed[reduce-overhead-backward] 3.3725ms 3.3052ms 302.5574 Ops/s 255.9302 Ops/s $\textbf{\color{#35bf28}+18.22\%}$
test_cql_speed[False-None] 39.2959ms 36.4214ms 27.4564 Ops/s 27.1607 Ops/s $\color{#35bf28}+1.09\%$
test_cql_speed[False-backward] 48.1199ms 46.4111ms 21.5466 Ops/s 21.4835 Ops/s $\color{#35bf28}+0.29\%$
test_cql_speed[True-None] 16.9021ms 15.5216ms 64.4264 Ops/s 64.3059 Ops/s $\color{#35bf28}+0.19\%$
test_cql_speed[True-backward] 24.6025ms 22.0322ms 45.3880 Ops/s 43.9906 Ops/s $\color{#35bf28}+3.18\%$
test_cql_speed[reduce-overhead-None] 16.6446ms 15.6212ms 64.0157 Ops/s 62.2317 Ops/s $\color{#35bf28}+2.87\%$
test_cql_speed[reduce-overhead-backward] 23.3352ms 21.8589ms 45.7479 Ops/s 44.2697 Ops/s $\color{#35bf28}+3.34\%$
test_a2c_speed[False-None] 8.1929ms 7.1454ms 139.9499 Ops/s 136.9198 Ops/s $\color{#35bf28}+2.21\%$
test_a2c_speed[False-backward] 14.8195ms 14.2236ms 70.3057 Ops/s 69.4606 Ops/s $\color{#35bf28}+1.22\%$
test_a2c_speed[True-None] 4.9311ms 4.1732ms 239.6254 Ops/s 237.1166 Ops/s $\color{#35bf28}+1.06\%$
test_a2c_speed[True-backward] 11.0210ms 10.5720ms 94.5893 Ops/s 93.7459 Ops/s $\color{#35bf28}+0.90\%$
test_a2c_speed[reduce-overhead-None] 4.5704ms 4.1729ms 239.6405 Ops/s 236.4280 Ops/s $\color{#35bf28}+1.36\%$
test_a2c_speed[reduce-overhead-backward] 11.7921ms 10.5846ms 94.4769 Ops/s 94.1787 Ops/s $\color{#35bf28}+0.32\%$
test_ppo_speed[False-None] 8.2310ms 7.3595ms 135.8790 Ops/s 132.0242 Ops/s $\color{#35bf28}+2.92\%$
test_ppo_speed[False-backward] 16.2421ms 14.4943ms 68.9929 Ops/s 68.2230 Ops/s $\color{#35bf28}+1.13\%$
test_ppo_speed[True-None] 4.0603ms 3.6695ms 272.5168 Ops/s 268.2309 Ops/s $\color{#35bf28}+1.60\%$
test_ppo_speed[True-backward] 9.8997ms 9.4852ms 105.4273 Ops/s 104.8146 Ops/s $\color{#35bf28}+0.58\%$
test_ppo_speed[reduce-overhead-None] 4.0016ms 3.6590ms 273.2995 Ops/s 268.0089 Ops/s $\color{#35bf28}+1.97\%$
test_ppo_speed[reduce-overhead-backward] 10.3528ms 9.4832ms 105.4498 Ops/s 103.7283 Ops/s $\color{#35bf28}+1.66\%$
test_reinforce_speed[False-None] 7.9564ms 6.4980ms 153.8943 Ops/s 152.4146 Ops/s $\color{#35bf28}+0.97\%$
test_reinforce_speed[False-backward] 10.6512ms 9.6591ms 103.5292 Ops/s 102.6276 Ops/s $\color{#35bf28}+0.88\%$
test_reinforce_speed[True-None] 3.5211ms 2.6310ms 380.0800 Ops/s 375.9591 Ops/s $\color{#35bf28}+1.10\%$
test_reinforce_speed[True-backward] 8.9956ms 8.5094ms 117.5164 Ops/s 116.2352 Ops/s $\color{#35bf28}+1.10\%$
test_reinforce_speed[reduce-overhead-None] 3.1757ms 2.6163ms 382.2135 Ops/s 374.9499 Ops/s $\color{#35bf28}+1.94\%$
test_reinforce_speed[reduce-overhead-backward] 9.3854ms 8.5498ms 116.9618 Ops/s 114.9777 Ops/s $\color{#35bf28}+1.73\%$
test_iql_speed[False-None] 34.7288ms 32.0317ms 31.2191 Ops/s 30.7717 Ops/s $\color{#35bf28}+1.45\%$
test_iql_speed[False-backward] 46.7681ms 44.7530ms 22.3449 Ops/s 22.1895 Ops/s $\color{#35bf28}+0.70\%$
test_iql_speed[True-None] 11.6578ms 10.4254ms 95.9198 Ops/s 93.3187 Ops/s $\color{#35bf28}+2.79\%$
test_iql_speed[True-backward] 22.6085ms 21.2822ms 46.9876 Ops/s 46.4645 Ops/s $\color{#35bf28}+1.13\%$
test_iql_speed[reduce-overhead-None] 11.4332ms 10.4409ms 95.7771 Ops/s 93.7829 Ops/s $\color{#35bf28}+2.13\%$
test_iql_speed[reduce-overhead-backward] 22.1229ms 21.1501ms 47.2811 Ops/s 46.1520 Ops/s $\color{#35bf28}+2.45\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.1037ms 4.8515ms 206.1225 Ops/s 202.7653 Ops/s $\color{#35bf28}+1.66\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.8947ms 0.5137ms 1.9466 KOps/s 1.9417 KOps/s $\color{#35bf28}+0.25\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6877ms 0.4843ms 2.0647 KOps/s 2.0313 KOps/s $\color{#35bf28}+1.65\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.7762ms 4.6225ms 216.3342 Ops/s 214.4042 Ops/s $\color{#35bf28}+0.90\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 3.1100ms 0.5029ms 1.9883 KOps/s 1.9875 KOps/s $\color{#35bf28}+0.04\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.8122ms 0.4760ms 2.1009 KOps/s 2.1170 KOps/s $\color{#d91a1a}-0.76\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.9190ms 1.6233ms 616.0105 Ops/s 600.3984 Ops/s $\color{#35bf28}+2.60\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 2.2058ms 1.5819ms 632.1663 Ops/s 623.2249 Ops/s $\color{#35bf28}+1.43\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 7.2030ms 4.7518ms 210.4470 Ops/s 207.7197 Ops/s $\color{#35bf28}+1.31\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.4013ms 0.6423ms 1.5569 KOps/s 1.5417 KOps/s $\color{#35bf28}+0.98\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 1.0655ms 0.6236ms 1.6035 KOps/s 1.6166 KOps/s $\color{#d91a1a}-0.81\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 4.9130ms 4.5922ms 217.7612 Ops/s 211.6235 Ops/s $\color{#35bf28}+2.90\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.4089s 1.0452ms 956.7851 Ops/s 1.9060 KOps/s $\textbf{\color{#d91a1a}-49.80\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6756ms 0.4866ms 2.0551 KOps/s 2.0328 KOps/s $\color{#35bf28}+1.10\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 4.9757ms 4.5552ms 219.5297 Ops/s 212.5650 Ops/s $\color{#35bf28}+3.28\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.9023ms 0.4995ms 2.0021 KOps/s 1.9585 KOps/s $\color{#35bf28}+2.23\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.8371ms 0.4827ms 2.0717 KOps/s 2.1219 KOps/s $\color{#d91a1a}-2.37\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 7.0458ms 4.7712ms 209.5921 Ops/s 205.4175 Ops/s $\color{#35bf28}+2.03\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 3.0124ms 0.6497ms 1.5391 KOps/s 1.5429 KOps/s $\color{#d91a1a}-0.25\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 1.0783ms 0.6270ms 1.5950 KOps/s 1.5918 KOps/s $\color{#35bf28}+0.20\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 5.6914ms 4.1482ms 241.0669 Ops/s 237.0619 Ops/s $\color{#35bf28}+1.69\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 7.6153ms 2.2777ms 439.0375 Ops/s 440.7738 Ops/s $\color{#d91a1a}-0.39\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 4.8610ms 1.2851ms 778.1725 Ops/s 755.7242 Ops/s $\color{#35bf28}+2.97\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.4146s 12.3719ms 80.8286 Ops/s 245.9961 Ops/s $\textbf{\color{#d91a1a}-67.14\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 8.9604ms 2.3218ms 430.6988 Ops/s 410.4523 Ops/s $\color{#35bf28}+4.93\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 4.5607ms 1.3250ms 754.7393 Ops/s 770.1848 Ops/s $\color{#d91a1a}-2.01\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 5.7585ms 4.3116ms 231.9319 Ops/s 242.5355 Ops/s $\color{#d91a1a}-4.37\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 0.3920s 10.2295ms 97.7568 Ops/s 411.0310 Ops/s $\textbf{\color{#d91a1a}-76.22\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 1.8272ms 1.3516ms 739.8502 Ops/s 650.0501 Ops/s $\textbf{\color{#35bf28}+13.81\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 11.1109ms 10.8679ms 92.0143 Ops/s 85.1608 Ops/s $\textbf{\color{#35bf28}+8.05\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 17.7325ms 14.8800ms 67.2044 Ops/s 66.4228 Ops/s $\color{#35bf28}+1.18\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 21.6209ms 19.6973ms 50.7684 Ops/s 49.7207 Ops/s $\color{#35bf28}+2.11\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 17.0217ms 15.0219ms 66.5693 Ops/s 65.4257 Ops/s $\color{#35bf28}+1.75\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 19.8994ms 19.5383ms 51.1815 Ops/s 48.4898 Ops/s $\textbf{\color{#35bf28}+5.55\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 17.8663ms 16.2742ms 61.4470 Ops/s 59.8172 Ops/s $\color{#35bf28}+2.72\%$
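The columns of the CPU table above appear to be related in a simple way: Ops is the reciprocal of Mean, and Change is the relative difference between Ops and Ops on Repo HEAD. The bolded rows are exactly those whose change exceeds 5% in magnitude, and their counts (4 green, 7 red) match the "Improved" and "Worsened" totals. The sketch below reproduces a few rows under those assumptions. The function names and the 5% threshold are illustrative and inferred from the data; they are not taken from the benchmark bot's source.

```python
def ops_per_second(mean_seconds: float) -> float:
    """Throughput implied by a mean runtime (the 'Ops' column)."""
    return 1.0 / mean_seconds


def percent_change(ops: float, ops_head: float) -> float:
    """Relative change of this PR's throughput against Repo HEAD, in percent."""
    return (ops - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold_pct: float = 5.0) -> str:
    """Label a row. The 5% threshold is an assumption inferred from the table."""
    if change_pct > threshold_pct:
        return "improved"
    if change_pct < -threshold_pct:
        return "worsened"
    return "unchanged"


# Rows copied from the CPU table: (name, Ops, Ops on Repo HEAD), both in Ops/s.
rows = [
    ("test_simple", 2.3628, 2.2844),
    (
        "test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000]",
        956.7851,
        1906.0,
    ),
    ("test_td3_speed[reduce-overhead-backward]", 302.5574, 255.9302),
]

for name, ops, head in rows:
    change = percent_change(ops, head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Several of the large regressions (for example the `test_rb_iterate` and `test_rb_populate` rows) have a Max that is orders of magnitude above the Mean, so a single slow iteration may be dominating the average; the sketch only recomputes the reported numbers and does not try to account for that.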


$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}19$. Worsened: $\large\color{#d91a1a}11$.

Columns: Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_simple 0.7527s 0.7491s 1.3350 Ops/s 1.3143 Ops/s $\color{#35bf28}+1.58\%$
test_transformed 1.0018s 1.0007s 0.9993 Ops/s 0.9821 Ops/s $\color{#35bf28}+1.75\%$
test_serial 2.1549s 2.1458s 0.4660 Ops/s 0.4568 Ops/s $\color{#35bf28}+2.02\%$
test_parallel 2.0083s 1.9660s 0.5087 Ops/s 0.5038 Ops/s $\color{#35bf28}+0.97\%$
test_step_mdp_speed[True-True-True-True-True] 0.2259ms 39.6976μs 25.1905 KOps/s 24.6856 KOps/s $\color{#35bf28}+2.05\%$
test_step_mdp_speed[True-True-True-True-False] 0.2148ms 22.4886μs 44.4670 KOps/s 43.3488 KOps/s $\color{#35bf28}+2.58\%$
test_step_mdp_speed[True-True-True-False-True] 0.2202ms 21.8366μs 45.7948 KOps/s 44.7970 KOps/s $\color{#35bf28}+2.23\%$
test_step_mdp_speed[True-True-True-False-False] 45.0010μs 12.6867μs 78.8224 KOps/s 77.5495 KOps/s $\color{#35bf28}+1.64\%$
test_step_mdp_speed[True-True-False-True-True] 78.7520μs 41.5622μs 24.0603 KOps/s 23.6489 KOps/s $\color{#35bf28}+1.74\%$
test_step_mdp_speed[True-True-False-True-False] 57.7010μs 24.3724μs 41.0300 KOps/s 39.5062 KOps/s $\color{#35bf28}+3.86\%$
test_step_mdp_speed[True-True-False-False-True] 79.6910μs 23.8189μs 41.9835 KOps/s 40.3497 KOps/s $\color{#35bf28}+4.05\%$
test_step_mdp_speed[True-True-False-False-False] 84.6520μs 14.6975μs 68.0390 KOps/s 66.5749 KOps/s $\color{#35bf28}+2.20\%$
test_step_mdp_speed[True-False-True-True-True] 79.6310μs 44.1427μs 22.6538 KOps/s 22.4178 KOps/s $\color{#35bf28}+1.05\%$
test_step_mdp_speed[True-False-True-True-False] 56.6800μs 26.7631μs 37.3649 KOps/s 36.7564 KOps/s $\color{#35bf28}+1.66\%$
test_step_mdp_speed[True-False-True-False-True] 63.5010μs 24.3448μs 41.0766 KOps/s 40.6124 KOps/s $\color{#35bf28}+1.14\%$
test_step_mdp_speed[True-False-True-False-False] 45.6310μs 14.6560μs 68.2313 KOps/s 66.1650 KOps/s $\color{#35bf28}+3.12\%$
test_step_mdp_speed[True-False-False-True-True] 99.5920μs 45.7029μs 21.8804 KOps/s 21.5214 KOps/s $\color{#35bf28}+1.67\%$
test_step_mdp_speed[True-False-False-True-False] 59.6710μs 29.0345μs 34.4418 KOps/s 34.1836 KOps/s $\color{#35bf28}+0.76\%$
test_step_mdp_speed[True-False-False-False-True] 61.4110μs 25.8389μs 38.7013 KOps/s 38.2723 KOps/s $\color{#35bf28}+1.12\%$
test_step_mdp_speed[True-False-False-False-False] 80.0520μs 16.5236μs 60.5195 KOps/s 58.9394 KOps/s $\color{#35bf28}+2.68\%$
test_step_mdp_speed[False-True-True-True-True] 69.6810μs 44.2849μs 22.5810 KOps/s 22.1417 KOps/s $\color{#35bf28}+1.98\%$
test_step_mdp_speed[False-True-True-True-False] 58.7910μs 26.7230μs 37.4210 KOps/s 36.6671 KOps/s $\color{#35bf28}+2.06\%$
test_step_mdp_speed[False-True-True-False-True] 59.7110μs 27.7516μs 36.0340 KOps/s 35.3252 KOps/s $\color{#35bf28}+2.01\%$
test_step_mdp_speed[False-True-True-False-False] 38.9010μs 16.4259μs 60.8794 KOps/s 60.3753 KOps/s $\color{#35bf28}+0.83\%$
test_step_mdp_speed[False-True-False-True-True] 0.1666ms 46.0289μs 21.7255 KOps/s 21.4812 KOps/s $\color{#35bf28}+1.14\%$
test_step_mdp_speed[False-True-False-True-False] 58.5010μs 29.2432μs 34.1960 KOps/s 34.1402 KOps/s $\color{#35bf28}+0.16\%$
test_step_mdp_speed[False-True-False-False-True] 3.3750ms 29.9278μs 33.4138 KOps/s 33.2237 KOps/s $\color{#35bf28}+0.57\%$
test_step_mdp_speed[False-True-False-False-False] 69.2810μs 18.6094μs 53.7363 KOps/s 53.6137 KOps/s $\color{#35bf28}+0.23\%$
test_step_mdp_speed[False-False-True-True-True] 79.1020μs 48.3533μs 20.6811 KOps/s 20.4935 KOps/s $\color{#35bf28}+0.92\%$
test_step_mdp_speed[False-False-True-True-False] 70.5910μs 31.2340μs 32.0164 KOps/s 31.6246 KOps/s $\color{#35bf28}+1.24\%$
test_step_mdp_speed[False-False-True-False-True] 57.9910μs 30.0938μs 33.2295 KOps/s 33.0468 KOps/s $\color{#35bf28}+0.55\%$
test_step_mdp_speed[False-False-True-False-False] 49.3910μs 18.6527μs 53.6115 KOps/s 53.8724 KOps/s $\color{#d91a1a}-0.48\%$
test_step_mdp_speed[False-False-False-True-True] 81.8610μs 50.0187μs 19.9925 KOps/s 19.8763 KOps/s $\color{#35bf28}+0.59\%$
test_step_mdp_speed[False-False-False-True-False] 67.5810μs 33.3549μs 29.9806 KOps/s 30.3860 KOps/s $\color{#d91a1a}-1.33\%$
test_step_mdp_speed[False-False-False-False-True] 56.0300μs 31.0895μs 32.1652 KOps/s 32.0349 KOps/s $\color{#35bf28}+0.41\%$
test_step_mdp_speed[False-False-False-False-False] 48.9010μs 20.4337μs 48.9387 KOps/s 49.6514 KOps/s $\color{#d91a1a}-1.44\%$
test_values[generalized_advantage_estimate-True-True] 26.9863ms 25.4730ms 39.2572 Ops/s 38.7693 Ops/s $\color{#35bf28}+1.26\%$
test_values[vec_generalized_advantage_estimate-True-True] 0.1020s 2.9459ms 339.4578 Ops/s 346.4149 Ops/s $\color{#d91a1a}-2.01\%$
test_values[td0_return_estimate-False-False] 0.1060ms 79.8766μs 12.5193 KOps/s 12.1037 KOps/s $\color{#35bf28}+3.43\%$
test_values[td1_return_estimate-False-False] 56.6824ms 55.9719ms 17.8661 Ops/s 17.4033 Ops/s $\color{#35bf28}+2.66\%$
test_values[vec_td1_return_estimate-False-False] 1.3384ms 1.0920ms 915.7517 Ops/s 909.0599 Ops/s $\color{#35bf28}+0.74\%$
test_values[td_lambda_return_estimate-True-False] 89.2884ms 88.4712ms 11.3031 Ops/s 10.9563 Ops/s $\color{#35bf28}+3.17\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.3876ms 1.0913ms 916.3111 Ops/s 901.6212 Ops/s $\color{#35bf28}+1.63\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 25.6448ms 25.1209ms 39.8075 Ops/s 39.3229 Ops/s $\color{#35bf28}+1.23\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0630ms 0.7580ms 1.3192 KOps/s 1.2759 KOps/s $\color{#35bf28}+3.40\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.8230ms 0.6745ms 1.4826 KOps/s 1.4702 KOps/s $\color{#35bf28}+0.84\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.6840ms 1.4910ms 670.7126 Ops/s 667.6369 Ops/s $\color{#35bf28}+0.46\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.8467ms 0.6876ms 1.4544 KOps/s 1.4311 KOps/s $\color{#35bf28}+1.63\%$
test_dqn_speed[False-None] 7.6034ms 1.5218ms 657.1366 Ops/s 657.0110 Ops/s $\color{#35bf28}+0.02\%$
test_dqn_speed[False-backward] 2.2488ms 2.1158ms 472.6333 Ops/s 465.9080 Ops/s $\color{#35bf28}+1.44\%$
test_dqn_speed[True-None] 0.7553ms 0.5296ms 1.8884 KOps/s 1.8650 KOps/s $\color{#35bf28}+1.25\%$
test_dqn_speed[True-backward] 1.3236ms 1.1926ms 838.5389 Ops/s 816.2363 Ops/s $\color{#35bf28}+2.73\%$
test_dqn_speed[reduce-overhead-None] 0.7118ms 0.5441ms 1.8379 KOps/s 1.7509 KOps/s $\color{#35bf28}+4.97\%$
test_dqn_speed[reduce-overhead-backward] 1.2294ms 1.0563ms 946.7447 Ops/s 891.5057 Ops/s $\textbf{\color{#35bf28}+6.20\%}$
test_ddpg_speed[False-None] 3.2179ms 2.8550ms 350.2666 Ops/s 344.3738 Ops/s $\color{#35bf28}+1.71\%$
test_ddpg_speed[False-backward] 4.4992ms 4.2930ms 232.9371 Ops/s 232.8967 Ops/s $\color{#35bf28}+0.02\%$
test_ddpg_speed[True-None] 1.2407ms 1.0580ms 945.1794 Ops/s 935.0481 Ops/s $\color{#35bf28}+1.08\%$
test_ddpg_speed[True-backward] 2.4385ms 2.2628ms 441.9339 Ops/s 431.8339 Ops/s $\color{#35bf28}+2.34\%$
test_ddpg_speed[reduce-overhead-None] 1.2514ms 1.0687ms 935.7295 Ops/s 923.9514 Ops/s $\color{#35bf28}+1.27\%$
test_ddpg_speed[reduce-overhead-backward] 1.7735ms 1.6268ms 614.6951 Ops/s 551.7659 Ops/s $\textbf{\color{#35bf28}+11.41\%}$
test_sac_speed[False-None] 8.7481ms 8.1010ms 123.4420 Ops/s 121.3842 Ops/s $\color{#35bf28}+1.70\%$
test_sac_speed[False-backward] 11.8857ms 11.1072ms 90.0316 Ops/s 86.2205 Ops/s $\color{#35bf28}+4.42\%$
test_sac_speed[True-None] 1.7871ms 1.5078ms 663.2274 Ops/s 649.8724 Ops/s $\color{#35bf28}+2.06\%$
test_sac_speed[True-backward] 3.5678ms 3.3657ms 297.1107 Ops/s 291.8918 Ops/s $\color{#35bf28}+1.79\%$
test_sac_speed[reduce-overhead-None] 24.0496ms 12.8254ms 77.9700 Ops/s 78.5472 Ops/s $\color{#d91a1a}-0.73\%$
test_sac_speed[reduce-overhead-backward] 1.6616ms 1.5187ms 658.4785 Ops/s 731.3291 Ops/s $\textbf{\color{#d91a1a}-9.96\%}$
test_redq_speed[False-None] 8.4031ms 7.5339ms 132.7334 Ops/s 127.4220 Ops/s $\color{#35bf28}+4.17\%$
test_redq_speed[False-backward] 12.7826ms 11.7649ms 84.9984 Ops/s 85.4369 Ops/s $\color{#d91a1a}-0.51\%$
test_redq_speed[True-None] 2.3384ms 1.9776ms 505.6535 Ops/s 499.3686 Ops/s $\color{#35bf28}+1.26\%$
test_redq_speed[True-backward] 3.9286ms 3.6714ms 272.3732 Ops/s 258.7445 Ops/s $\textbf{\color{#35bf28}+5.27\%}$
test_redq_speed[reduce-overhead-None] 2.7439ms 1.9866ms 503.3775 Ops/s 496.4330 Ops/s $\color{#35bf28}+1.40\%$
test_redq_speed[reduce-overhead-backward] 4.1490ms 3.8132ms 262.2488 Ops/s 254.2801 Ops/s $\color{#35bf28}+3.13\%$
test_redq_deprec_speed[False-None] 9.7529ms 9.0843ms 110.0797 Ops/s 107.1354 Ops/s $\color{#35bf28}+2.75\%$
test_redq_deprec_speed[False-backward] 13.1352ms 12.4088ms 80.5877 Ops/s 79.1388 Ops/s $\color{#35bf28}+1.83\%$
test_redq_deprec_speed[True-None] 2.6156ms 2.3352ms 428.2329 Ops/s 429.1528 Ops/s $\color{#d91a1a}-0.21\%$
test_redq_deprec_speed[True-backward] 4.3349ms 4.1681ms 239.9194 Ops/s 247.7916 Ops/s $\color{#d91a1a}-3.18\%$
test_redq_deprec_speed[reduce-overhead-None] 2.5149ms 2.3134ms 432.2675 Ops/s 428.7623 Ops/s $\color{#35bf28}+0.82\%$
test_redq_deprec_speed[reduce-overhead-backward] 4.6745ms 4.1669ms 239.9859 Ops/s 238.5811 Ops/s $\color{#35bf28}+0.59\%$
test_td3_speed[False-None] 8.0991ms 7.9203ms 126.2584 Ops/s 123.4566 Ops/s $\color{#35bf28}+2.27\%$
test_td3_speed[False-backward] 11.0119ms 10.5451ms 94.8309 Ops/s 93.2612 Ops/s $\color{#35bf28}+1.68\%$
test_td3_speed[True-None] 1.6634ms 1.5557ms 642.8147 Ops/s 648.1404 Ops/s $\color{#d91a1a}-0.82\%$
test_td3_speed[True-backward] 3.3839ms 3.2533ms 307.3771 Ops/s 305.9703 Ops/s $\color{#35bf28}+0.46\%$
test_td3_speed[reduce-overhead-None] 78.4566ms 26.1802ms 38.1968 Ops/s 36.7929 Ops/s $\color{#35bf28}+3.82\%$
test_td3_speed[reduce-overhead-backward] 1.7266ms 1.4621ms 683.9669 Ops/s 674.0928 Ops/s $\color{#35bf28}+1.46\%$
test_cql_speed[False-None] 17.8777ms 16.9744ms 58.9121 Ops/s 57.6977 Ops/s $\color{#35bf28}+2.10\%$
test_cql_speed[False-backward] 23.1085ms 22.5502ms 44.3454 Ops/s 43.9276 Ops/s $\color{#35bf28}+0.95\%$
test_cql_speed[True-None] 3.2038ms 2.9110ms 343.5266 Ops/s 341.8889 Ops/s $\color{#35bf28}+0.48\%$
test_cql_speed[True-backward] 5.6452ms 5.1804ms 193.0365 Ops/s 190.1693 Ops/s $\color{#35bf28}+1.51\%$
test_cql_speed[reduce-overhead-None] 21.8016ms 13.1903ms 75.8135 Ops/s 76.0245 Ops/s $\color{#d91a1a}-0.28\%$
test_cql_speed[reduce-overhead-backward] 1.8276ms 1.6856ms 593.2502 Ops/s 646.8146 Ops/s $\textbf{\color{#d91a1a}-8.28\%}$
test_a2c_speed[False-None] 3.4919ms 3.2224ms 310.3256 Ops/s 301.9008 Ops/s $\color{#35bf28}+2.79\%$
test_a2c_speed[False-backward] 7.0237ms 6.4265ms 155.6054 Ops/s 158.4448 Ops/s $\color{#d91a1a}-1.79\%$
test_a2c_speed[True-None] 1.2412ms 1.0193ms 981.1035 Ops/s 996.2669 Ops/s $\color{#d91a1a}-1.52\%$
test_a2c_speed[True-backward] 2.6938ms 2.5552ms 391.3558 Ops/s 360.3667 Ops/s $\textbf{\color{#35bf28}+8.60\%}$
test_a2c_speed[reduce-overhead-None] 21.7559ms 11.7222ms 85.3083 Ops/s 86.2155 Ops/s $\color{#d91a1a}-1.05\%$
test_a2c_speed[reduce-overhead-backward] 1.1613ms 0.9830ms 1.0173 KOps/s 865.4418 Ops/s $\textbf{\color{#35bf28}+17.55\%}$
test_ppo_speed[False-None] 4.0593ms 3.7453ms 267.0007 Ops/s 264.4585 Ops/s $\color{#35bf28}+0.96\%$
test_ppo_speed[False-backward] 7.3893ms 6.9552ms 143.7774 Ops/s 138.7026 Ops/s $\color{#35bf28}+3.66\%$
test_ppo_speed[True-None] 1.1318ms 0.9470ms 1.0560 KOps/s 1.0528 KOps/s $\color{#35bf28}+0.30\%$
test_ppo_speed[True-backward] 2.6652ms 2.5195ms 396.9038 Ops/s 391.9537 Ops/s $\color{#35bf28}+1.26\%$
test_ppo_speed[reduce-overhead-None] 0.6843ms 0.5056ms 1.9779 KOps/s 1.8883 KOps/s $\color{#35bf28}+4.75\%$
test_ppo_speed[reduce-overhead-backward] 1.0953ms 0.9677ms 1.0334 KOps/s 993.9452 Ops/s $\color{#35bf28}+3.97\%$
test_reinforce_speed[False-None] 2.4588ms 2.2622ms 442.0500 Ops/s 434.7371 Ops/s $\color{#35bf28}+1.68\%$
test_reinforce_speed[False-backward] 3.8014ms 3.3228ms 300.9484 Ops/s 303.8291 Ops/s $\color{#d91a1a}-0.95\%$
test_reinforce_speed[True-None] 1.0413ms 0.8284ms 1.2072 KOps/s 1.1967 KOps/s $\color{#35bf28}+0.88\%$
test_reinforce_speed[True-backward] 2.5798ms 2.3840ms 419.4597 Ops/s 404.5852 Ops/s $\color{#35bf28}+3.68\%$
test_reinforce_speed[reduce-overhead-None] 22.7590ms 11.9027ms 84.0147 Ops/s 89.3452 Ops/s $\textbf{\color{#d91a1a}-5.97\%}$
test_reinforce_speed[reduce-overhead-backward] 1.3693ms 1.0477ms 954.4717 Ops/s 934.5947 Ops/s $\color{#35bf28}+2.13\%$
test_iql_speed[False-None] 9.7592ms 9.3146ms 107.3579 Ops/s 106.1032 Ops/s $\color{#35bf28}+1.18\%$
test_iql_speed[False-backward] 13.7880ms 13.0611ms 76.5633 Ops/s 75.3260 Ops/s $\color{#35bf28}+1.64\%$
test_iql_speed[True-None] 1.9777ms 1.7272ms 578.9784 Ops/s 575.9111 Ops/s $\color{#35bf28}+0.53\%$
test_iql_speed[True-backward] 4.3666ms 4.1655ms 240.0672 Ops/s 227.1143 Ops/s $\textbf{\color{#35bf28}+5.70\%}$
test_iql_speed[reduce-overhead-None] 15.4475ms 8.9532ms 111.6915 Ops/s 86.5503 Ops/s $\textbf{\color{#35bf28}+29.05\%}$
test_iql_speed[reduce-overhead-backward] 1.4722ms 1.4347ms 696.9964 Ops/s 624.8813 Ops/s $\textbf{\color{#35bf28}+11.54\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 8.0753ms 6.4497ms 155.0466 Ops/s 151.6534 Ops/s $\color{#35bf28}+2.24\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.7017ms 0.3690ms 2.7099 KOps/s 2.6736 KOps/s $\color{#35bf28}+1.36\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6140ms 0.3589ms 2.7866 KOps/s 3.1414 KOps/s $\textbf{\color{#d91a1a}-11.29\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.6837ms 6.2195ms 160.7856 Ops/s 158.3817 Ops/s $\color{#35bf28}+1.52\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.1425ms 0.3207ms 3.1184 KOps/s 2.9894 KOps/s $\color{#35bf28}+4.32\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6575ms 0.2672ms 3.7420 KOps/s 3.2612 KOps/s $\textbf{\color{#35bf28}+14.74\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.6733ms 1.4287ms 699.9290 Ops/s 652.4705 Ops/s $\textbf{\color{#35bf28}+7.27\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.6784ms 1.3784ms 725.4672 Ops/s 678.2844 Ops/s $\textbf{\color{#35bf28}+6.96\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.7231ms 6.3882ms 156.5393 Ops/s 154.4229 Ops/s $\color{#35bf28}+1.37\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.8737ms 0.4895ms 2.0430 KOps/s 2.2730 KOps/s $\textbf{\color{#d91a1a}-10.12\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8063ms 0.4776ms 2.0937 KOps/s 2.4184 KOps/s $\textbf{\color{#d91a1a}-13.43\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.5341ms 6.2349ms 160.3864 Ops/s 158.0110 Ops/s $\color{#35bf28}+1.50\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.9846ms 0.3946ms 2.5342 KOps/s 3.4643 KOps/s $\textbf{\color{#d91a1a}-26.85\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5752ms 0.3290ms 3.0397 KOps/s 3.7272 KOps/s $\textbf{\color{#d91a1a}-18.45\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.6510ms 6.1938ms 161.4523 Ops/s 158.5157 Ops/s $\color{#35bf28}+1.85\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.7985ms 0.2672ms 3.7426 KOps/s 2.6261 KOps/s $\textbf{\color{#35bf28}+42.52\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6529ms 0.3386ms 2.9533 KOps/s 2.6991 KOps/s $\textbf{\color{#35bf28}+9.42\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.6744ms 6.3510ms 157.4551 Ops/s 153.6901 Ops/s $\color{#35bf28}+2.45\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.8953ms 0.4201ms 2.3802 KOps/s 2.1670 KOps/s $\textbf{\color{#35bf28}+9.84\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6717ms 0.3969ms 2.5195 KOps/s 2.3812 KOps/s $\textbf{\color{#35bf28}+5.81\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 6.7869ms 5.3084ms 188.3817 Ops/s 179.3323 Ops/s $\textbf{\color{#35bf28}+5.05\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 3.9146ms 1.9084ms 524.0025 Ops/s 497.1301 Ops/s $\textbf{\color{#35bf28}+5.41\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 8.3392ms 1.2580ms 794.8996 Ops/s 793.3786 Ops/s $\color{#35bf28}+0.19\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 7.5099ms 5.4073ms 184.9352 Ops/s 181.9477 Ops/s $\color{#35bf28}+1.64\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 8.3438ms 2.0413ms 489.8938 Ops/s 414.6437 Ops/s $\textbf{\color{#35bf28}+18.15\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 0.5536s 12.3199ms 81.1695 Ops/s 813.6321 Ops/s $\textbf{\color{#d91a1a}-90.02\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 7.4457ms 5.6009ms 178.5438 Ops/s 30.3584 Ops/s $\textbf{\color{#35bf28}+488.12\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 10.6181ms 2.1923ms 456.1343 Ops/s 481.3833 Ops/s $\textbf{\color{#d91a1a}-5.25\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 7.2350ms 1.4094ms 709.5013 Ops/s 750.7028 Ops/s $\textbf{\color{#d91a1a}-5.49\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 13.4584ms 13.0431ms 76.6689 Ops/s 75.1487 Ops/s $\color{#35bf28}+2.02\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 19.2862ms 17.1029ms 58.4697 Ops/s 56.3801 Ops/s $\color{#35bf28}+3.71\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 18.5754ms 17.6141ms 56.7726 Ops/s 54.0983 Ops/s $\color{#35bf28}+4.94\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 19.4294ms 17.3273ms 57.7124 Ops/s 55.7816 Ops/s $\color{#35bf28}+3.46\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 17.7666ms 17.3991ms 57.4742 Ops/s 54.8673 Ops/s $\color{#35bf28}+4.75\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 20.1449ms 18.5793ms 53.8233 Ops/s 52.1904 Ops/s $\color{#35bf28}+3.13\%$
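
The percentage in the last column of each row appears to be the relative difference between the two throughput figures. The header row is not part of this excerpt, so which branch each column belongs to is an assumption here. Below is a minimal sketch of that arithmetic, checked against three rows above. The helper names `to_ops_per_sec` and `relative_change` are hypothetical and not part of the benchmark tooling.

```python
# Hypothetical helpers: a sketch of how the last column can be reproduced
# from the two throughput columns. Not taken from the benchmark tooling.

# Unit prefixes seen in the rows above (Ops/s and KOps/s).
_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3}


def to_ops_per_sec(value: float, unit: str) -> float:
    """Normalise a throughput figure to plain operations per second."""
    return value * _SCALE[unit]


def relative_change(first_ops: float, second_ops: float) -> float:
    """Percent change of the first throughput relative to the second."""
    return (first_ops - second_ops) / second_ops * 100.0


# test_redq_deprec_speed[True-backward]: 239.9194 Ops/s vs 247.7916 Ops/s
redq = relative_change(239.9194, 247.7916)

# test_a2c_speed[reduce-overhead-backward]: 1.0173 KOps/s vs 865.4418 Ops/s
a2c = relative_change(to_ops_per_sec(1.0173, "KOps/s"),
                      to_ops_per_sec(865.4418, "Ops/s"))

# test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400]
rb = relative_change(178.5438, 30.3584)

print(f"{redq:+.2f}% {a2c:+.2f}% {rb:+.2f}%")
```

In the rows shown, the bold entries are those whose change exceeds 5% in absolute value. Very large swings such as the -90.02% and +488.12% `test_rb_populate` rows coincide with a max time far above the mean (0.5536s against 12.3199ms in the first case), which points to outlier runs more than to a real throughput change.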

@vmoens vmoens merged commit 105440a into gh/vmoens/40/base Dec 14, 2024
51 of 66 checks passed
vmoens added a commit that referenced this pull request Dec 14, 2024
ghstack-source-id: ddc131acedbbe451b28758e757a8c240ebd72b80
Pull Request resolved: #2569
@vmoens vmoens deleted the gh/vmoens/40/head branch December 14, 2024 00:28